In the evolutionary minority game, agents are allowed to evolve their strategies ("mutate") based
on past experience. We explore the dependence of the system's global behavior on the response
time and the mutation threshold of the agents. We find that the precise values of these parameters
determine whether the strategy distribution of the population has a U-shape, an inverse-U-shape, or a
W-shape. It is shown that in a free society (market), highly adaptive agents (with short response times)
perform best. In addition, "patient" agents (with high mutation threshold) outperform "nervous"
ones.



A problem of wide interest in biological and socio- 
economic systems is that of an evolving population in 
which individual agents adapt their behavior according 
to past experience. The Minority Game (MG) is one of 
the most studied models of such complex systems (see 
e.g., [1-20] and references therein). In this model, a pop- 
ulation of N agents with limited information and capabil- 
ities repeatedly compete for a limited global resource, or 
to be in the minority. In financial markets for instance, 
more sellers than buyers implies lower prices, and it is 
therefore better for a trader to be in a minority group of 
buyers. Predators foraging for food will do better if they 
hunt in areas with fewer competitors. Rush-hour drivers, 
facing the choice between two alternative routes, wish to 
choose the route containing the minority of traffic [1]. 

At each round of the game, every individual has to 
choose whether to be in room '0' (e.g., choosing to sell an 
asset or taking route A) or in room '1' (e.g., choosing to 
buy an asset or taking route B). At the end of each turn,
agents belonging to the smaller group (the minority) are
the winners: each of them gains one point (the "prize"),
whereas the others lose a point (the "fine"). The agents
have a common 'memory' look-up table, containing the 
outcomes of recent occurrences. Faced with a given bit 
string of recent occurrences, each agent chooses the out- 
come in the memory (the so-called "predicted trend") 
with probability p, known as the agent's "gene" value 
(and the opposite alternative with probability 1 - p).
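
As an illustration, a single round of the game described above can be sketched as follows. This is a minimal sketch and not the code used for the simulations reported here: the function name `play_round` and its arguments are hypothetical, and the predicted trend is passed in directly rather than being read from the common memory look-up table.

```python
import random


def play_round(genes, predicted_trend, rng):
    """Play one round of the game for a population of agents.

    genes           -- list of gene values p, one per agent (0 <= p <= 1)
    predicted_trend -- the outcome (0 or 1) stored in the common memory
                       (assumed to be given; the look-up table is not modeled)
    rng             -- a random.Random instance
    """
    # Each agent follows the predicted trend with probability p,
    # and chooses the opposite room with probability 1 - p.
    choices = [
        predicted_trend if rng.random() < p else 1 - predicted_trend
        for p in genes
    ]
    ones = sum(choices)
    zeros = len(choices) - ones
    # The population size is assumed to be odd, so there is never a tie.
    winning_room = 1 if ones < zeros else 0
    # Minority agents gain one point (the "prize"), the others lose one.
    payoffs = [1 if c == winning_room else -1 for c in choices]
    return choices, winning_room, payoffs


rng = random.Random(0)
N = 101  # odd number of agents (illustrative value)
genes = [rng.random() for _ in range(N)]
choices, winning_room, payoffs = play_round(genes, predicted_trend=1, rng=rng)
print("winning room:", winning_room, "winners:", payoffs.count(1))
```

Since the winners always form the minority, the sum of all payoffs in a round is negative by construction.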

The evolutionary formulation of the model (EMG) 
[5,15] allows agents to adapt their strategy according to 
their past experience: if an agent's score falls below some
value D (the mutation threshold), he mutates, i.e., his gene
value is modified. In this sense, each agent tries to learn
from his past mistakes, and to adjust his strategy in order
to survive.

In previous studies of the EMG, the criterion accord- 
ing to which each agent decided whether or not to change 
his strategy was based on his performance in all previ- 
ous rounds of the game, giving equal weights to each of 
these rounds. Such a crude criterion lacks the capability 
of quantifying the 'local' performance of an agent (his 
net success in the last few rounds of the game). It may
therefore lead to situations in which agents take the
wrong decisions (based on the state of the system in the
far past) without noticing that the system has already
evolved into a completely different global state. Thus, of
great interest for the study of realistic systems of compet-
ing (and evolving) agents are situations in which agents
are capable of adapting their strategy according to their
present (local) performance (rather than using a crude
criterion for mutation, one which gives equal weights
to all previous rounds of the game).

The aim of the present work is to explore the dynamics 
of evolving populations with various levels of adaptation 
(various response times, see a precise definition below) 
and with different values of the mutation threshold. Of 
main importance is the identification of the strategies 
that perform best in a particular situation. 

In the present formulation of the model, each agent 
holds a measure of his past performance through a mov- 
ing average S(t; T), whose value reflects the payoffs from 
recent T rounds of the game. The moving average is 
updated with each turn of the game [21]: 

$$S(t;T) = \left(1 - \frac{1}{T}\right) S(t-1;T) + \frac{1}{T} A(t) , \qquad (1)$$

where A(t) = ±1 is the agent's payoff at time step t. 
Thus, the information about previous outcomes has a 
'half life' of ~ T turns [the contribution of a given turn 
to S(t;T) falls exponentially with successive rounds]. If 
the moving average of an agent falls below the mutation
threshold, his strategy (i.e., his gene value) is modified.
After mutation, the agent enters a "trial period" of T
rounds before considering mutating again. The muta-
tion threshold D characterizes the "patience" of an agent:
the smaller the value of D, the more tolerant the agent
(he is willing to suffer some local losses without modifying
his strategy). The value of the parameter T is a measure
of the agent's level of adaptiveness, that is, his response
time to temporal changes in the state of the system. The
smaller the value of T, the faster the agent's response to
any deterioration in his performance.
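
The update rule described above can be sketched as follows. This is only a sketch under stated assumptions: the moving average is taken in the exponentially weighted form S <- (1 - 1/T) S + A/T, so that the weight of a given turn decays as (1 - 1/T)^k after k further rounds. The mutation step (a new gene value drawn uniformly from a window of width `width` around the old one, with reflecting boundaries at 0 and 1) is the convention commonly used in the EMG literature and is not specified in the text; the names `Agent`, `update` and `width` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    p: float        # gene value
    T: int          # response time
    D: float        # mutation threshold
    S: float = 0.0  # moving average of recent payoffs
    trial: int = 0  # rounds left in the trial period


def update(agent, payoff, rng, width=0.2):
    """Update the moving average; mutate if it falls below D.

    Returns True if the agent mutated in this round.
    """
    # Exponentially weighted moving average (assumed form of Eq. (1)).
    agent.S = (1 - 1 / agent.T) * agent.S + payoff / agent.T
    # During the trial period the agent does not consider mutating.
    if agent.trial > 0:
        agent.trial -= 1
        return False
    if agent.S < agent.D:
        # Assumed mutation rule: small random step with reflecting walls.
        new_p = agent.p + rng.uniform(-width / 2, width / 2)
        if new_p < 0.0:
            new_p = -new_p
        if new_p > 1.0:
            new_p = 2.0 - new_p
        agent.p = new_p
        # S is not reset here; whether it should be is an assumption.
        agent.trial = agent.T
        return True
    return False


rng = random.Random(0)
agent = Agent(p=0.5, T=4, D=-0.5)
for step in range(1, 9):
    mutated = update(agent, payoff=-1, rng=rng)  # a losing streak
    print(step, round(agent.S, 4), mutated)
```

In this example a "patient" agent (D = -0.5) with T = 4 tolerates two consecutive losses and mutates on the third; it then waits T = 4 rounds before it may mutate again.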

Figure 1 displays the long-time averaged gene distri-
bution P(p) of the agents for a fixed response time. We
find three qualitatively different populations, depending
on the precise value of the mutation threshold D. For
$D < D_c^{(1)}$ (this corresponds to a population of "patient"
agents, ones who are willing to suffer some temporary
losses without changing their strategies) the population
tends to form a W-shaped distribution [the precise
value of $D_c^{(1)}$ depends on the value of the response time
T]. Remarkably, we find that this W-shaped strategy
distribution is dynamically metastable: from time to time
the system undergoes a short and abrupt change into an
inverse-U-shaped distribution, and then quickly returns
to a W-shaped distribution. On the other hand, for
$D > D_c^{(2)}$ ("nervous" agents who hurry to change their
strategies in response to even small local losses) the
population tends to crowd around p = 1/2, forming a
(stable) inverse-U-shaped gene distribution. This
corresponds to "confused" and "indecisive" agents
(agents that prefer a coin-tossing strategy). There is also
an intermediate phase [for $D_c^{(1)} < D < D_c^{(2)}$], in which
P(p) has a U-shape with two symmetric peaks at p = 0
and p = 1: the population tends to self-segregate (this
corresponds to always or never following what happened
last time). To flourish in such a population, an agent
should behave in an extreme way.

FIG. 1. The strategy distribution P(p) for different values
of the mutation threshold. The results are for N = 10001
agents, and a fixed response time of T = 25. Each point
represents an average value over 10 runs and 20000 time steps
per run.

The (scaled) efficiency of the system is defined as the
number of agents in the minority room, divided by the
maximal possible size of the minority group, (N - 1)/2.
Figure 2 displays the system's efficiency as a function of
the mutation threshold D (for several values of the
response time T). We also display the efficiency
for agents guessing randomly between room '0' and room
'1', and for a uniform gene distribution P(p). There is a
range of mutation thresholds D for which the efficiency
of the system is better than the random case. Thus, the
agents cooperate indirectly to achieve an optimum uti-
lization of the system's resources. However, there is also
a range of D values for which the efficiency of the system
is remarkably lower than that obtained for agents choos-
ing via independent coin-tosses. In this range, considering
the efficiency of the system as a whole, the agents would be
better off not adapting their strategies, because they are
doing worse than just guessing at random.
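
The definition of the scaled efficiency, together with the coin-tossing baseline, can be sketched as follows. This is a minimal illustration, not the code behind Fig. 2; the function names `efficiency` and `coin_toss_efficiency` are hypothetical, and the population size and number of rounds are chosen only to keep the example fast.

```python
import random


def efficiency(choices):
    """Scaled efficiency of one round: minority size over (N - 1)/2."""
    n = len(choices)
    ones = sum(choices)
    minority = min(ones, n - ones)
    return minority / ((n - 1) / 2)


def coin_toss_efficiency(n_agents, rounds, rng):
    """Average efficiency when every agent picks a room at random."""
    total = 0.0
    for _ in range(rounds):
        choices = [rng.randint(0, 1) for _ in range(n_agents)]
        total += efficiency(choices)
    return total / rounds


print(efficiency([0, 0, 1]))  # one agent in the minority out of three -> 1.0
print(coin_toss_efficiency(101, 200, random.Random(0)))
```

An efficiency of 1 means that the minority room is as full as it can be, whereas an efficiency of 0 means that all agents chose the same room and nobody won.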

FIG. 2. The efficiency of the system as a function of
the mutation threshold D. Horizontal lines represent the
efficiency for uniform P(p) distribution (dashed) and a
coin-tossing situation (dashed-dotted). Initially, there is a
uniform distribution of the strategies. The results are for
N = 10001 agents. Each point represents an average value
over 10 runs and 20000 time steps per run.

Figure 3 displays the system's efficiency as a function 
of the response time T (and for various different values 
of the mutation threshold D). Note that the system's 
global efficiency is a monotonically increasing function of
the response time for intermediate values of the muta- 
tion threshold. However, for systems composed of ner- 
vous agents (large D values), and for systems composed 
of patient members (very small D values), the utiliza- 
tion of the system's resources is optimal for intermediate 
response times. 

Next, we relax the condition that all members have
the same (common) response time. We consider a pop-
ulation of competing and evolving agents in which each
individual is free to adopt a personal response time in
the range $1 < T_i < T_{\max}$. Figure 4 displays the system's
efficiency as a function of the mutation threshold (which
is still common to all members of the population). For
comparison, we also display the efficiency of a homo-
geneous population in which all agents have the same
response time. One finds that allowing each agent to
choose his own personal response time may improve the
global efficiency of the system. Note, however, that for
intermediate values of the mutation threshold, this free-
dom (to choose a personal response time) may cause a
decrease in the system's global efficiency.

FIG. 3. The efficiency of the system as a function of
the (common) response time T. Horizontal lines represent
the efficiency for uniform P(p) distribution (dashed) and a
coin-tossing situation (dashed-dotted). Initially, there is a
uniform distribution of the strategies. The results are for
N = 10001 agents. Each point represents an average value
over 10 runs and 20000 time steps per run.

Finally, we consider the case of a free society (market) 
in which each member is allowed to choose both his per-
sonal response time and his mutation threshold.
Figure 5 displays the winning probability of an agent in 
such a population as a function of his personal mutation 
threshold. We find that agents with small (negative) val- 
ues of the mutation threshold D perform best. These are 
"patient" agents who are willing to suffer some tempo- 
rary losses without modifying their strategy. 

In Fig. 6 we display the winning probability of an agent 
as a function of his personal response time. One finds 
that in such free populations agents with short response 
times perform best. In fact, their winning probability 
exceeds 50%. It turns out that these agents assess their
performance very often, which allows them to respond
quickly and efficiently to any change in the global state of
the system. On the other hand, the winning probability
has a minimum at intermediate values of the response
time. [Note, however, that in a population composed of
agents with only short response times ($T_{\max} = 8$ in Fig.
6), it is best to have the largest response time available.]

In summary, we have explored the dynamics of com- 
plex adaptive systems with various different values of re- 
sponse times and mutation thresholds. The main results 
and their implications are as follows: 

(i) A population of "patient" agents [$D < D_c^{(1)}$] tends
to form a W-shaped distribution of strategies. The W-
shaped gene distribution is intriguing in the sense that it
does not appear in adaptive systems in which agents assess
their performance according to all previous rounds of the
evolution [5,15]. This is a new feature of the present
model.

FIG. 4. The efficiency of the system as a function of the
(common) mutation threshold D. Agents have a common
mutation threshold, but different response times $1 < T_i < 50$.
For comparison we also display the efficiency of a system com-
posed of agents with a common response time of T = 50
(dashed-dotted curve). Horizontal lines represent the efficiency
for uniform P(p) distribution (dashed) and a coin-tossing situ-
ation (dashed-dotted). The results are for N = 10001 agents.
Each point represents an average value over 100 runs and
20000 time steps per run.

On the other hand, a population of "nervous" agents
[$D > D_c^{(2)}$] tends to cluster around p = 1/2 (a coin-tossing
strategy). Stated in a more pictorial way, confusion and
indecisiveness take over in nervous systems.

(ii) An evolving population achieves an optimum uti- 
lization of its global resources for small negative values 
of the mutation threshold D (see Fig. 2). This corre- 
sponds to a population of patient members. For large 
D values agents tend to be indecisive (preferring a coin- 
tossing strategy), a behavior which destroys any attempt 
to establish (indirect) cooperation. It seems that "ner- 
vousness" prevents the agents from achieving a reason- 
able utilization of their resources. 

(iii) In a free society of competing agents (in which each 
member has the freedom to adapt his own response time 
and mutation threshold) patient agents perform best (see 
Fig. 5). 

(iv) The best performance is achieved by agents who
have very short response times (see Fig. 6). These agents
have a high level of adaptiveness, making it possible for
them to respond quickly and efficiently to local changes
in the state of the system. The success rate of such agents
actually exceeds 50%. (Agents who have very long re-
sponse times also perform reasonably well, whereas the
winning probability drops to a minimum at intermediate
values of the response time.)

FIG. 5. The winning probability of an agent as a function
of his mutation threshold. Each agent is free to adopt a per-
sonal mutation threshold and a personal response time. The
results are for N = 10001 agents. Each point represents an
average value over 100 runs and 20000 time steps per run.

FIG. 6. The winning probability of an agent as a function
of his response time. Each agent is free to adopt both a
personal mutation threshold and a personal response time.
The parameters are the same as in Fig. 5.